Search | Portal Regional da BVS

Low Complexity Gradient Computation Techniques to Accelerate Deep Neural Network Training.

Shin, Dongyeob; Kim, Geonho; Jo, Joongho; Park, Jongsun.

IEEE Trans Neural Netw Learn Syst ; 34(9): 5745-5759, 2023 Sep.

Article in English | MEDLINE | ID: mdl-34890336

ABSTRACT

Deep neural network (DNN) training is an iterative process of updating network weights through gradient computation, for which the (mini-batch) stochastic gradient descent (SGD) algorithm is generally used. Since SGD inherently tolerates noisy gradient computations, approximating the weight gradients within the SGD noise level is a promising way to save energy and time during DNN training. This article proposes two novel techniques to reduce the computational complexity of gradient computation and thereby accelerate SGD-based DNN training. First, because the output predictions of a network (its confidence) change with the training inputs, the relation between the confidence and the magnitude of the weight gradient can be exploited to skip gradient computations without seriously sacrificing accuracy, especially for high-confidence inputs. Second, angle diversity-based approximations of the intermediate activations used for weight gradient calculation are presented. Based on the fact that the angle diversity of gradients is small (highly uncorrelated) in the early training epochs, the bit precision of activations can be reduced to 2, 4, or 8 bits depending on the resulting angle error between the original gradient and the quantized gradient. Simulations show that the proposed approach can skip up to 75.83% of gradient computations with negligible accuracy degradation on the CIFAR-10 dataset using ResNet-20. Hardware implementation results in 65-nm CMOS technology also show that the proposed training accelerator achieves up to 1.69× better energy efficiency than other training accelerators.
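The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the symmetric uniform quantizer, the fixed confidence threshold, and the single-sample linear-layer gradient are all assumptions made for illustration. The sketch skips a sample when the top predicted probability is high, and otherwise picks the smallest activation bit width (2, 4, or 8) whose quantized weight gradient stays within a chosen angle of the original gradient.

```python
import math

def skip_gradient(probabilities, confidence_threshold=0.99):
    """Hypothetical skip rule: skip when the top softmax probability is high."""
    return max(probabilities) >= confidence_threshold

def quantize(values, bits):
    """Symmetric uniform quantization (an assumed scheme, not the paper's)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    levels = 2 ** (bits - 1) - 1      # positive levels; 1 for 2-bit, 7 for 4-bit
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]

def angle_deg(u, v):
    """Angle in degrees between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 90.0
    cosine = max(-1.0, min(1.0, dot / norm))   # clamp against rounding error
    return math.degrees(math.acos(cosine))

def weight_gradient(delta, activations):
    """Flattened weight gradient of a linear layer for one sample: outer(delta, a)."""
    return [d * a for d in delta for a in activations]

def pick_bits(delta, activations, max_angle_deg, candidates=(2, 4, 8)):
    """Smallest candidate bit width whose gradient angle error is acceptable."""
    reference = weight_gradient(delta, activations)
    for bits in candidates:
        approx = weight_gradient(delta, quantize(activations, bits))
        if angle_deg(reference, approx) <= max_angle_deg:
            return bits
    return None   # no candidate is accurate enough; keep full precision

delta = [1.0, -2.0]                   # back-propagated error (made-up values)
activations = [0.9, 0.1, 0.5, 0.0]    # layer input (made-up values)
chosen = pick_bits(delta, activations, max_angle_deg=5.0)
```

A looser angle budget lets the lower bit widths through, while a tight one falls back to 8 bits or to full precision. How the angle budget should follow the measured angle diversity over training is described in the article and is not modeled here.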

Exploiting Retraining-Based Mixed-Precision Quantization for Low-Cost DNN Accelerator Design.

Kim, Nahsung; Shin, Dongyeob; Choi, Wonseok; Kim, Geonho; Park, Jongsun.

IEEE Trans Neural Netw Learn Syst ; 32(7): 2925-2938, 2021 Jul.

Article in English | MEDLINE | ID: mdl-32745007

ABSTRACT

For successful deployment of deep neural networks (DNNs) on resource-constrained devices, retraining-based quantization has been widely adopted to reduce the number of DRAM accesses. By properly setting training parameters, such as batch size and learning rate, both weights and activations can be uniformly quantized down to 4 bits while maintaining full-precision accuracy. In this article, we present a retraining-based mixed-precision quantization approach and its customized DNN accelerator to achieve high energy efficiency. In the proposed quantization, midway through retraining, an additional bit (extra quantization level) is assigned to the weights that have switched frequently between two contiguous quantization levels, since such switching means that neither quantization level helps to reduce the quantization loss. We also mitigate the gradient noise that occurs in the retraining process by using a lower learning rate near the quantization threshold. For the proposed mixed-precision quantized network (MPQ-network), we have implemented a customized accelerator in a 65-nm CMOS process. In the accelerator, the proposed processing elements (PEs) can be dynamically reconfigured to process variable bit widths from 2 to 4 bits for both weights and activations. The numerical results show that the proposed quantization achieves a 1.37× better compression ratio for VGG-9 on the CIFAR-10 dataset compared with a uniform 4-bit (both weights and activations) model without loss of classification accuracy. The proposed accelerator also shows 1.29× energy savings for VGG-9 on the CIFAR-10 dataset over the state-of-the-art accelerator.
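The switching criterion described in this abstract can be sketched in a few lines of Python. This is an illustrative assumption, not the authors' procedure: the uniform grid with spacing `step`, the helper names, and the `min_switches` threshold are hypothetical. The sketch records which quantization level a weight falls on at each retraining step and flags the weight for an extra bit when it keeps flipping between two adjacent levels.

```python
import math

def level_index(weight, step):
    """Index of the nearest level on a uniform grid with spacing `step`."""
    return math.floor(weight / step + 0.5)

def count_switches(level_history):
    """Number of moves between adjacent levels in consecutive steps."""
    pairs = zip(level_history, level_history[1:])
    return sum(1 for prev, cur in pairs if abs(cur - prev) == 1)

def needs_extra_bit(weight_history, step, min_switches):
    """Flag a weight that oscillates between exactly two contiguous levels."""
    levels = [level_index(w, step) for w in weight_history]
    only_two_levels = len(set(levels)) == 2
    return only_two_levels and count_switches(levels) >= min_switches

# A weight hovering around the 0.125 threshold between levels 0 and 0.25
oscillating = [0.12, 0.13, 0.12, 0.13, 0.12]
# A weight sitting firmly on the 0.25 level
stable = [0.24, 0.25, 0.26]

flag_oscillating = needs_extra_bit(oscillating, step=0.25, min_switches=3)
flag_stable = needs_extra_bit(stable, step=0.25, min_switches=3)
```

The oscillating weight is flagged and the stable one is not. The lower learning rate near the quantization threshold, which the abstract also mentions, is not modeled in this sketch.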
